Python: File I/O
Reading and writing files


CTO10-3-1 Fundamentals of 
Software Development 


Topic & Structure of the 
lesson 


• Reading and writing files
- Creating a text file
- Opening files in different modes
- Writing data into a file
- Reading from a file
- Searching through a file


Python Files I/O 


Learning outcomes 


• At the end of this lecture you should be able to:
- Develop a problem-based strategy for creating and applying programmed solutions
- Create, edit, compile, run, debug and test programs using an appropriate development environment




Key terms you must be able 
to use 


• If you have mastered this topic, you should be able to use the following terms correctly in your assignments and exams:
- open

File Processing


• A text file can be thought of as a sequence of lines

From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
Return-Path: <postmaster@collab.sakaiproject.org>
Date: Sat, 5 Jan 2008 09:12:18 -0500
To: source@collab.sakaiproject.org
From: stephen.marquard@uct.ac.za
Subject: [sakai] svn commit: r39772 - content/branches/
Details: http://source.sakaiproject.org/viewsvn/?view=rev&rev=39772

http://www.py4inf.com/code/mbox-short.txt

Opening a File

• Before we read the contents of a file, we must tell Python which file we are going to work with and what we will be doing with the file
• This is done with the open() function
• open() returns a "file handle" - a variable used to perform operations on the file
• Kind of like "File -> Open" in a word processor




Using open() 


handle = open(filename, mode)

• returns a handle, used to manipulate the file
• filename is a string
• mode is optional and should be 'r' if we are planning to read the file and 'w' if we are going to write to the file

fhand = open('mbox.txt', 'r')


http://docs.python.org/lib/built-in-funcs.html 
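
The two modes described above can be tried together in a short sketch. The file name demo.txt and the temporary directory are assumptions made for illustration, and the with statement (which closes the file automatically) is an addition to what the slide shows.

```python
import os
import tempfile

# Hypothetical location: a throwaway directory so no real file is overwritten.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

# Mode 'w' creates the file (or empties an existing one) and allows writing.
with open(path, 'w') as fhand:
    fhand.write('first line\n')
    fhand.write('second line\n')

# Mode 'r' (the default) opens an existing file for reading.
with open(path, 'r') as fhand:
    text = fhand.read()

print(text)
```

Note that write() does not add a newline by itself, so each '\n' is written explicitly.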

What is a Handle?


A: P:U 
ASIA PACIFIC UNIVERSITY 
OF TECHNOLOGY & INNOVATION 


>>> fhand = open('mbox.txt')
>>> print(fhand)
<_io.TextIOWrapper name='mbox.txt' mode='r' encoding='UTF-8'>


[Diagram: Your Program uses the handle (open, read, write, close) to reach the lines stored in mbox.txt]
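
A handle is an object that stands between the program and the file, not the file's data itself. The sketch below inspects a handle's mode and closed attributes; the temporary copy of mbox.txt with a single line is a hypothetical stand-in for the real file.

```python
import os
import tempfile

# Hypothetical stand-in for mbox.txt, created in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), 'mbox.txt')
with open(path, 'w') as out:
    out.write('From stephen.marquard@uct.ac.za\n')

fhand = open(path)
handle_mode = fhand.mode        # 'r', because no mode was given
was_closed = fhand.closed       # False while the file is open
first_line = fhand.readline()   # the data arrives through the handle
fhand.close()
now_closed = fhand.closed       # True after close()

print(handle_mode, was_closed, now_closed)
```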




When Files are Missing 


>>> fhand = open('stuff.txt')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: 'stuff.txt'
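
The error above can be caught instead of stopping the program. This is a minimal sketch; the path is hypothetical and points into a fresh, empty directory so that the file is guaranteed not to exist.

```python
import os
import tempfile

# Hypothetical path inside a new, empty directory, so the file cannot exist.
missing = os.path.join(tempfile.mkdtemp(), 'stuff.txt')

try:
    fhand = open(missing)
except FileNotFoundError as err:
    opened = False
    error_number = err.errno    # 2, the "[Errno 2]" shown in the traceback
    print('Could not open', err.filename)
else:
    opened = True
    fhand.close()
```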

The newline Character

• We use a special character called the "newline" to indicate when a line ends
• We represent it as \n in strings
• Newline is still one character - not two




>>> stuff = 'Hello\nWorld!'
>>> stuff
'Hello\nWorld!'
>>> print(stuff)
Hello
World!
>>> stuff = 'X\nY'
>>> print(stuff)
X
Y
>>> len(stuff)
3
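
The same point can be checked in a script. The sketch below uses repr() to make the newline visible and split() to separate the two lines; both calls are additions to the session above.

```python
stuff = 'X\nY'

length = len(stuff)          # 3: 'X', the newline, and 'Y'
shown = repr(stuff)          # repr() shows the escape sequence instead of breaking the line
parts = stuff.split('\n')    # splitting on the newline gives the two lines

print(length)
print(shown)
print(parts)
```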


File Processing

• A text file can be thought of as a sequence of lines

From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
Return-Path: <postmaster@collab.sakaiproject.org>
Date: Sat, 5 Jan 2008 09:12:18 -0500
To: source@collab.sakaiproject.org
From: stephen.marquard@uct.ac.za
Subject: [sakai] svn commit: r39772 - content/branches/
Details: http://source.sakaiproject.org/viewsvn/?view=rev&rev=39772

File Processing

• A text file has newlines at the end of each line

From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008\n
Return-Path: <postmaster@collab.sakaiproject.org>\n
Date: Sat, 5 Jan 2008 09:12:18 -0500\n
To: source@collab.sakaiproject.org\n
From: stephen.marquard@uct.ac.za\n
Subject: [sakai] svn commit: r39772 - content/branches/\n
Details: http://source.sakaiproject.org/viewsvn/?view=rev&rev=39772\n
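
One way to see these newlines is to read a whole file into a single string and display it with repr(). The two-line file written here is a hypothetical, shortened stand-in for the mail file.

```python
import os
import tempfile

# Hypothetical two-line file in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), 'two-lines.txt')
with open(path, 'w') as out:
    out.write('From: stephen.marquard@uct.ac.za\n')
    out.write('Subject: [sakai] svn commit\n')

with open(path) as fhand:
    contents = fhand.read()          # the whole file as one string

newline_count = contents.count('\n')
print(repr(contents))                # the \n characters are shown explicitly
print(newline_count)
```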

File Handle as a Sequence

• A file handle open for read can be treated as a sequence of strings where each line in the file is a string in the sequence
• We can use the for statement to iterate through a sequence
• Remember - a sequence is an ordered set

xfile = open('mbox.txt')
for cheese in xfile:
    print(cheese)
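
The loop above can be run against a small file to see what each string looks like. In this sketch the three-line file is hypothetical, and enumerate() is added only to number the lines.

```python
import os
import tempfile

# Hypothetical three-line file standing in for mbox.txt.
path = os.path.join(tempfile.mkdtemp(), 'mbox.txt')
with open(path, 'w') as out:
    out.write('line one\nline two\nline three\n')

lines_seen = []
with open(path) as xfile:
    # Each pass of the loop receives the next line, newline included.
    for number, cheese in enumerate(xfile, start=1):
        lines_seen.append(cheese)
        print(number, repr(cheese))
```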

Counting Lines in a File

• Open a file read-only
• Use a for loop to read each line
• Count the lines and print out the number of lines

fhand = open('mbox.txt')
count = 0
for line in fhand:
    count = count + 1
print('Line Count:', count)

Output:
Line Count: 132045
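
The counting pattern can be wrapped in a function so it works for any file. The name count_lines and the four-line sample file are assumptions made for this sketch.

```python
import os
import tempfile


def count_lines(filename):
    """Return the number of lines in the named text file."""
    count = 0
    with open(filename) as fhand:
        for line in fhand:
            count = count + 1
    return count


# Hypothetical four-line file used to try the function.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with open(path, 'w') as out:
    out.write('a\nb\nc\nd\n')

print('Line Count:', count_lines(path))
```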

Searching Through a File

• We can put an if statement in our for loop to only print lines that meet some criteria

fhand = open('mbox-short.txt')
for line in fhand:
    if line.startswith('From:'):
        print(line)

OOPS!

What are all these blank lines doing here?

From: stephen.marquard@uct.ac.za

From: louis@media.berkeley.edu

From: zqian@umich.edu

From: rjlowe@iupui.edu



OOPS!

What are all these blank lines doing here?

• Each line from the file has a newline at the end.
• The print function adds a newline to each line.

From: stephen.marquard@uct.ac.za\n
\n
From: louis@media.berkeley.edu\n
\n
From: zqian@umich.edu\n
\n
From: rjlowe@iupui.edu\n
\n
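
The doubled newline can be reproduced without a file. The sketch below prints into in-memory buffers so the exact characters can be inspected; the two sample lines are hypothetical, and passing end='' to print() is an alternative remedy that this lecture does not use.

```python
import io

# Hypothetical lines, as they would arrive from a file (newline included).
lines = ['From: a@example.com\n', 'From: b@example.com\n']

doubled = io.StringIO()
for line in lines:
    print(line, file=doubled)            # print adds '\n' after the line's own '\n'

single = io.StringIO()
for line in lines:
    print(line, end='', file=single)     # end='' suppresses the extra newline

print(repr(doubled.getvalue()))
print(repr(single.getvalue()))
```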


Searching Through a File (fixed)

• We can strip the whitespace from the right hand side of the string using the rstrip() string method
• The newline is considered "white space" and is stripped

fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    if line.startswith('From:'):
        print(line)

From: stephen.marquard@uct.ac.za
From: louis@media.berkeley.edu
From: zqian@umich.edu
From: rjlowe@iupui.edu
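
rstrip() removes all trailing whitespace, not only the newline. The following sketch compares it with two related string methods on a hypothetical line that has extra spaces around it.

```python
# Hypothetical line with leading spaces, a trailing space and a newline.
line = '  From: zqian@umich.edu \n'

right = line.rstrip()              # trailing space and newline removed
only_newline = line.rstrip('\n')   # only the newline removed
both_sides = line.strip()          # whitespace removed from both ends

print(repr(right))
print(repr(only_newline))
print(repr(both_sides))
```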

Skipping with continue

• We can conveniently skip a line by using the continue statement

fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    if not line.startswith('From:'):
        continue
    print(line)
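
The continue pattern can be tried on a small file. In this sketch the function name from_lines and the sample file contents are assumptions; matching lines are collected in a list rather than printed so the result is easy to check.

```python
import os
import tempfile


def from_lines(filename):
    """Return the lines that start with 'From:', without trailing whitespace."""
    kept = []
    with open(filename) as fhand:
        for line in fhand:
            line = line.rstrip()
            if not line.startswith('From:'):
                continue          # skip the rest of the loop body for this line
            kept.append(line)
    return kept


# Hypothetical sample file.
path = os.path.join(tempfile.mkdtemp(), 'mbox-short.txt')
with open(path, 'w') as out:
    out.write('From: a@example.com\nSubject: hello\nFrom: b@example.com\n')

print(from_lines(path))
```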

Using in to select lines

• We can look for a string anywhere in a line as our selection criteria

fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    if '@uct.ac.za' not in line:
        continue
    print(line)

From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
X-Authentication-Warning: set sender to stephen.marquard@uct.ac.za using -f
From: stephen.marquard@uct.ac.za
Author: stephen.marquard@uct.ac.za
From david.horwitz@uct.ac.za Fri Jan 4 07:02:32 2008
X-Authentication-Warning: set sender to david.horwitz@uct.ac.za using -f...
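
Unlike startswith(), the in operator matches wherever the text appears in the line. A small sketch with a hypothetical list of lines in place of the file:

```python
# Hypothetical lines; a list is used here instead of a file handle.
lines = [
    'From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008',
    'From: louis@media.berkeley.edu',
    'Author: stephen.marquard@uct.ac.za',
]

selected = []
for line in lines:
    if '@uct.ac.za' not in line:
        continue                  # no match anywhere in the line
    selected.append(line)

print(selected)
```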

Prompt for File Name

fname = input('Enter the file name: ')
fhand = open(fname)
count = 0
for line in fhand:
    if line.startswith('Subject:'):
        count = count + 1
print('There were', count, 'subject lines in', fname)

Enter the file name: mbox.txt
There were 1797 subject lines in mbox.txt

Enter the file name: mbox-short.txt
There were 27 subject lines in mbox-short.txt

Bad File Names

fname = input('Enter the file name: ')
try:
    fhand = open(fname)
except OSError:
    # Raised when the file is missing or cannot be opened
    print('File cannot be opened:', fname)
    exit()

count = 0
for line in fhand:
    if line.startswith('Subject:'):
        count = count + 1
print('There were', count, 'subject lines in', fname)

Enter the file name: mbox.txt
There were 1797 subject lines in mbox.txt

Enter the file name: na na boo boo
File cannot be opened: na na boo boo
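
The same check can be placed in a function that reports failure to its caller instead of exiting. This is a sketch under stated assumptions: the name count_subject_lines, the choice of None as the failure value, and the sample file are all hypothetical.

```python
import os
import tempfile


def count_subject_lines(fname):
    """Count lines starting with 'Subject:'; return None if the file cannot be opened."""
    try:
        fhand = open(fname)
    except OSError:
        return None
    count = 0
    with fhand:                   # closes the file when the loop is done
        for line in fhand:
            if line.startswith('Subject:'):
                count = count + 1
    return count


# Hypothetical sample file in a throwaway directory.
folder = tempfile.mkdtemp()
good = os.path.join(folder, 'mail.txt')
with open(good, 'w') as out:
    out.write('From: a@example.com\nSubject: hello\nSubject: again\n')

print(count_subject_lines(good))
print(count_subject_lines(os.path.join(folder, 'na na boo boo')))
```

A program that prompts the user could call this function with the value returned by input() and print its own message when the result is None.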

Summary

• Secondary storage
• Opening a file - file handle
• File structure - newline character
• Reading a file line-by-line with a for loop
• Searching for lines
• Reading file names
• Dealing with bad files